Exploring behavior states in animal movements

Chloe Bracis

What is the animal doing? Tools for exploring behavioural structure in animal movements

Eliezer Gurarie, Chloe Bracis, Maria Delgado, Trevor D. Meckley, Ilpo Kojola and C. Michael Wagner

Journal of Animal Ecology, 2015

Overview of workshop

  • Questions

  • Tools

    • First passage time (metric-based)
    • Bayesian partitioning of Markov models (classification)
    • Behavioral change point analysis (phenomenological)
    • Mulitstate random walk models (mechanistic)
    • Expectation-Maximization Binary Clustering (classificaion)
  • Application

    • Simulated data
    • Real data

Questions

Explotatory

  • What is the animal doing?

Explanatory

  • Why is the animal doing what it’s doing?

Predictive

  • Can we anticipate the movement response of the animal?

Supervised vs unsupervised classification

Supervised

  • User provides information to direct classification
    • Training data
    • Bounds
    • Number of classes

Unsupervised

  • User selects algorithm, but does not otherwise aid classification process
  • User knowlege to later relate classes to domain-specific catagories

Make use of information you have about the system!

Tools

Tools

Example data: Lamprey

require(waddle)
data(Lamprey)

# smooth track for GPS error when stationary
Lamprey.smooth = SmoothTrack(Lamprey, 3)

plot of chunk unnamed-chunk-2plot of chunk unnamed-chunk-2

First passage time

plot of chunk unnamed-chunk-3

FPT - time it takes for an individual to enter and leave a circle of fixed radius r drawn around each location

First passage time

Parameter to specify: radius

require(adehabitatLT)

Lamprey.traj = as.ltraj(data.frame(Lamprey$X, Lamprey$Y), Lamprey$Time, 
                   id = "Lamprey")
Lamprey.smooth.traj = as.ltraj(data.frame(Lamprey.smooth$X, Lamprey.smooth$Y), Lamprey.smooth$Time, 
                   id = "Lamprey")

radii = c(5, 10, 20)
Lamprey.fpt = fpt(Lamprey.traj, radii)
Lamprey.smooth.fpt = fpt(Lamprey.smooth.traj, radii)

FPT: output

plot of chunk unnamed-chunk-5

FPT: extensions

Other implementations:

  • Variance in log(FPT) high at scales where there are patterns (Fauchald & Tveraa 2003)
  • Residence time (RT) - allow animal to leave & return (Barraquand & Benhamou 2008)
  • Partitioning algorithms for behavioral phases (Lavielle 1999, 2005)
    • other partitioning/clustering algorithms (e.g., mixtools package)

FPT: partitioning

plot of chunk unnamed-chunk-6

FPT: partitioning

plot of chunk unnamed-chunk-7

Bayesian partitioning of Markov models

  • Classisfication method
  • Movement composed of discrete number of homogeneous processes (candidate models)
  • Markovian transitions between models
  • Randomized likelihood-based method
    • choose the number of candidate models
    • partition the time series into a specific sequence of those models
  • See modpartltraj in adehabitatLT and (Guéguen 2001, 2009)

Bayesian partitioning of Markov models

  1. Regularize the data (no gaps)
  2. Select movement variable (e.g., step length)
  3. Define set of candidate models (e.g., Gaussian models with range of means and constant variance)
  4. Select optimal number of models (randomized likelihood)
  5. Partition track (assign model for each partition)

Assumptions:

  • step lengths are independent between observations

Lamprey BPMM

First, regularize data:

Lamprey.smooth.reg = InterpolatePoints(Lamprey.smooth, 2, "min")$Data
Lamprey.smooth.traj = as.ltraj(data.frame(Lamprey.smooth.reg$X, Lamprey.smooth.reg$Y), 
                   Lamprey.smooth.reg$Time, id = "Lamprey")

BPMM: step lengths

  • Need to pick within-segment standard deviation
  • Difference in variance -> may need to transform data plot of chunk unnamed-chunk-9

BPMM: How many partitions?

See modpartltraj in adehabitatLT

Lamprey.smooth.segments = Prep.segments(Lamprey.smooth.traj, units = "min", 
                            sd = 5, nmodels = 20)

plot of chunk unnamed-chunk-10

Maximum likelihood for K =  4 

BPMM: Partition data

See partmod.ltraj in adehabitatLT

Lamprey.smooth.partition = Partition.segments(Lamprey.smooth.segments)
plot.segments(Lamprey.smooth.partition, 
              xlab="", ylab="Step length (m)")

plot of chunk unnamed-chunk-11

BPMM: Diagnostics

DiagPlot.segments(Lamprey.smooth.partition)

plot of chunk unnamed-chunk-12

Behavioral Change Point Analysis

  • based on time series analysis
  • select response variable
    • e.g., persistence velocity = speed * cos(turning angle)
  • sweep window across time series and identify most likely change point in window
  • model response as continuous-time autoregressive Gaussian process with mean, standard deviation, and characteristic time-scale of autocorrelation

Behavioral Change Point Analysis

Smooth

  • estimates parameters for each data point averaged over all windows

Flat

  • finds the most frequently chosen change points and estimates the parameter values within each phase
  • comparable to BPMM

Behavioral Change Point Analysis

  • can use irregularly sampled data
  • don't pre-specify number of states
  • set several tuning parameters
    • sensitivity parameter K (penalty for extra params)
    • window size w (corresponds to time scale change point may occur)
    • cluster width for flat BCPA (how many neighbors pooled)

Lamprey BCPA

Format data and set tuning parameters

Lamprey.VT = GetVT(Lamprey.smooth, units = "min")
windowsize = 30
windowstep = 1
K = 0.5

Choose function of data columns for analysis

Lamprey.ws = WindowSweep(Lamprey.VT, "V*cos(Theta)", 
                          windowsize, windowstep, 
                          plotme = FALSE, K = K, 
                          tau = TRUE, progress = FALSE)

Lamprey Smooth BCPA

plot(Lamprey.ws, type="smooth", threshold = 2, 
     legend = FALSE)

plot of chunk unnamed-chunk-15

Lamprey Flat BCPA

plot(Lamprey.ws, type="flat", clusterwidth = 3, 
     legend = FALSE)

plot of chunk unnamed-chunk-16

Lamprey Smooth BCPA Diagnostics

DiagPlot(Lamprey.ws, "smooth")

plot of chunk unnamed-chunk-17

Lamprey Flat BCPA Diagnostics

DiagPlot(Lamprey.ws, "flat")

plot of chunk unnamed-chunk-18

Multistate Random Walk

  • Animal transitions between several states
  • Each state unique parameterization of movement model
  • Transitions between states
    • independent
    • Markovian with probability \( p_{ij} \) of switching from state i to j

Multistate Random Walk

Movement model: correlated random walk (CRW)

  • Turning angles: circular distribution (e.g. wrapped Cauchy)
  • Step lengths: unimodal positive distribution (e.g. Weibull)

Multistate Random Walk

Assumptions:

  • Regular data
  • Specify number of states
  • Specify movement model

Model fitting:

  • Bayesian MCMC using JAGS, stan, etc.
  • Maximum likelihood using moveHMM

Lamprey two state MRW

Movement:

  • steps ~ Weibull(\( shape_i \), \( scale_i \))
  • angles ~ wrapCauchy(\( mu_i \), \( rho_i \))

State switching:

  • switching probabilities 2x2 matrix P

MRW: Model fitting steps

  1. Prep data
  2. Set parameter start values
  3. Fit model
  4. Evaluate fit

MRW: Data preparation

require(moveHMM)
Lamprey.hmm = prepData(Lamprey.smooth, type = "UTM", coordNames = c("X", "Y"))

plot(Lamprey.hmm, compact = TRUE)

plot of chunk unnamed-chunk-19plot of chunk unnamed-chunk-19

summary(Lamprey.hmm)
Movement data for 1 animal:
Animal1 -- 430 observations

Covariate(s): 
 Time 
                      Min.                        25% 
"2010-05-02 14:50:40 CEST" "2010-05-02 17:32:15 CEST" 
                    Median                       Mean 
"2010-05-02 20:13:20 CEST" "2010-05-02 20:22:12 CEST" 
                       75%                       Max. 
"2010-05-02 23:15:30 CEST" "2010-05-03 02:05:20 CEST" 

 Z 
           Min.             25%          Median            Mean 
728016+5043025i 728631+5042149i 728632+5042149i 728671+5042191i
            75%            Max. 
728639+5042148i 729223+5041914i

 Depth 
     Min.       25%    Median      Mean       75%      Max. 
0.2926667 5.0550000 5.0550000 4.4891907 5.0550000 5.7146667 

MRW: Data preparation

summary(Lamprey.hmm)
Movement data for 1 animal:
Animal1 -- 430 observations

Covariate(s): 
 Time 
                      Min.                        25% 
"2010-05-02 14:50:40 CEST" "2010-05-02 17:32:15 CEST" 
                    Median                       Mean 
"2010-05-02 20:13:20 CEST" "2010-05-02 20:22:12 CEST" 
                       75%                       Max. 
"2010-05-02 23:15:30 CEST" "2010-05-03 02:05:20 CEST" 

 Z 
           Min.             25%          Median            Mean 
728016+5043025i 728631+5042149i 728632+5042149i 728671+5042191i
            75%            Max. 
728639+5042148i 729223+5041914i

 Depth 
     Min.       25%    Median      Mean       75%      Max. 
0.2926667 5.0550000 5.0550000 4.4891907 5.0550000 5.7146667 

MRW: fit two state model

weibullShape = c(3, 0.5)
weibullScale = c(10, 10)
wcauchyMu = c(0, pi)
wcauchyRho = c(0.7, 0.2)

doubleStateModel = fitHMM(Lamprey.hmm, nbStates = 2, 
            stepPar0 = c(weibullShape, weibullScale), 
            anglePar0 = c(wcauchyMu, wcauchyRho),
            stepDist = "weibull", angleDist = "wrpcauchy")

MRW: two state model - output

Value of the maximum log-likelihood: -1397.219 

Step length parameters:
----------------------
      state 1   state 2
shape  2.5679 0.9615813
scale 38.7629 1.0270557

Turning angle parameters:
------------------------
                  state 1    state 2
mean          -0.01419219 -0.2208763
concentration  0.85071975  0.3227927

Regression coeffs for the transition probabilities:
--------------------------------------------------
             1 -> 2    2 -> 1
intercept -4.035819 -5.045129

Transition probability matrix:
-----------------------------
            [,1]       [,2]
[1,] 0.982635653 0.01736435
[2,] 0.006399412 0.99360059

Initial distribution:
--------------------
[1] 9.999994e-01 6.328919e-07

MRW: two state model - confidence intervals

$stepPar
$stepPar$lower
        state 1   state 2
shape  2.252089 0.8857300
scale 35.954625 0.9085821

$stepPar$upper
        state 1  state 2
shape  2.927998 1.043928
scale 41.790508 1.160977


$anglePar
$anglePar$lower
                  state 1    state 2
mean          -0.06121604 -0.4458002
concentration  0.81355297  0.2507313

$anglePar$upper
                state 1      state 2
mean          0.0317344 -0.002439682
concentration 0.8805810  0.390922736


$beta
$beta$lower
             1 -> 2    2 -> 1
intercept -5.433848 -6.435466

$beta$upper
             1 -> 2    2 -> 1
intercept -2.637791 -3.654792

MRW: two state model - plotting

Decoding states sequence... DONE

plot of chunk unnamed-chunk-24plot of chunk unnamed-chunk-24plot of chunk unnamed-chunk-24

MRW: two state model - states

Decoding states sequence... DONE
Computing states probabilities... DONE

plot of chunk unnamed-chunk-25

MRW: fit three state model

weibullShape = c(3, 1, 1)
weibullScale = c(20, 1, 2)
wcauchyMu = c(0, pi, 0)
wcauchyRho = c(0.9, 0.2, 0.6)

tripleStateModel = fitHMM(Lamprey.hmm, nbStates = 3, 
            stepPar0 = c(weibullShape, weibullScale), 
            anglePar0 = c(wcauchyMu, wcauchyRho),
            stepDist = "weibull", angleDist = "wrpcauchy")

MRW: three state model - plotting

Decoding states sequence... DONE

plot of chunk unnamed-chunk-27plot of chunk unnamed-chunk-27plot of chunk unnamed-chunk-27

MRW: three state model - states

Decoding states sequence... DONE
Computing states probabilities... DONE

plot of chunk unnamed-chunk-28

MRW: Model fitting - start values

for (i in 1:n)
{
  weibullShape = runif(nstates, 0, 50)
  weibullScale = runif(nstates, 0, 50)
  wcauchyMu = runif(nstates, -pi, pi)
  wcauchyRho = runif(nstates, 0, 1)

  startVals[i,] = c(weibullShape, weibullScale, wcauchyMu, wcauchyRho)

  tryCatch(
  {
    model = fitHMM(Lamprey.hmm, nbStates = nstates, stepPar0 = c(weibullShape, weibullScale), 
                    anglePar0 = c(wcauchyMu, wcauchyRho),
                    stepDist = "weibull", angleDist = "wrpcauchy")


    angleMean[i,] = model$mle$anglePar[1,]
    angleCon[i,] = model$mle$anglePar[2,]
    stepShape[i,] = model$mle$stepPar[1,]
    stepScale[i,] = model$mle$stepPar[2,]    
    nll[i] = model$mod$minimum
  }, 
  error = function(e) print(paste("i =", i, conditionMessage(e)))
  )
}

MRW: Model fitting - start values

plot of chunk unnamed-chunk-31plot of chunk unnamed-chunk-31

MRW: Model fitting - parameter estimates

plot of chunk unnamed-chunk-32

MRW: Model fitting - model comparison

AIC to choose among models

AIC(doubleStateModel, tripleStateModel)
             Model      AIC
1 tripleStateModel 2783.884
2 doubleStateModel 2816.437

However, AIC will frequently select more states than may be biologically relevant or interpretable.

MRW: Model fitting - three state model

ThresStateModelComparison

Multistate random walk

  • Include centers of attraction, i.e. central place foraging (McClintock et a. 2012)
  • Autocorrelation in step lengths and turning angles (Jonsen et al. 2005)
  • Review paper with advice on model fitting, errors, etc. (Jonsen et al. 2013)

Expectation-Maximization Binary Clustering

Clustering method (default is speed and turning angle, but can specify other variables) using Gaussian mixture model (Garriga et al. 2016)

  • tries to balance strong assumptions/ease of use (e.g. HMM) and statistical soundness
  • can post-process results to account for behavioral correlation
  • can weight observations by uncertainty

EMbC: Lamprey

require(EMbC)

# here we use the speeds and turning angles we already calculated,
# you can also pass in a move object directly
clustering = embc(as.matrix(Lamprey.VT[, c("V", "Theta")]))
... computing starting delimiters 
[1]   0  -0.0000e+00       4       428
[1]   1  -6.1054e+00       4       320
[1]   2  -5.9219e+00       4        96
[1]   3  -5.2132e+00       4        41
[1]   4  -4.3543e+00       4        39
[1]   5  -4.1245e+00       4        24
[1]   6  -4.0369e+00       4        12
[1]   7  -4.0044e+00       4        13
[1]   8  -3.9879e+00       4         5
[1]   9  -3.9819e+00       4         6
[1]  10  -3.9785e+00       4         2
[1]  11  -3.9717e+00       4         7
[1]  12  -3.9623e+00       4         6
[1]  13  -3.9574e+00       4         1
[1]  14  -3.9559e+00       4         1
[1]  15  -3.9550e+00       4         2
[1]  16  -3.9549e+00       4         4
[1]  17  -3.9544e+00       4         2
[1]  18  -3.9541e+00       4         0
[1]  19  -3.9538e+00       4         1
[1]  20  -3.9537e+00       4         0
[1]  21  -3.9536e+00       4         2
[1]  22  -3.9535e+00       4         0
[1]  23  -3.9535e+00       4         0
[1] ... Stable clustering

EMbC: Lamprey plot output

plot of chunk unnamed-chunk-35

EMbC: Lamprey plot output

plot of chunk unnamed-chunk-36

Comparison of all methods

LampreyResults

Exercise: simulated data

plot of chunk unnamed-chunk-37

Exercise: simulated data

  1. Using Nu.sim, Tau.sim, BCRW.sim in waddle, apply methods
  2. What do you think about the assumptions, strengths, weaknesses of each method?
  3. What can you determine about each simulated track?
data(Multipaths)

# ltraj object
Nu.traj = as.ltraj(data.frame(X = Re(Nu.sim$Z), 
                              Y = Im(Nu.sim$Z)), 
                   Sys.time() + 1:length(Nu.sim$Z), 
                   id = "Nu.sim")

# bcpa object
Nu.VT = GetVT(data.frame(Z = Nu.sim$Z, 
                         Time = Sys.time() + 1:length(Nu.sim$Z)), 
              units = "sec")

# moveHMM object
Nu.hmm = prepData(Nu.traj[[1]], type = "UTM", coordNames = c("x", "y"))

Exercise: simulated data

SimulatedTracks

Exercise: real data

For this section, use your own data, the Wolf data in the waddle package or another dataset

plot of chunk unnamed-chunk-39

Wolf data

Daily positions

data(Wolf)
Wolf2 = data.frame(X = tapply(Wolf$X, 
                               substr(Wolf$Time, 1, 10), mean), 
                    Y = tapply(Wolf$Y, 
                               substr(Wolf$Time, 1, 10), mean))
Wolf2$Time = as.POSIXct(row.names(Wolf2))

Wolf.traj = as.ltraj(data.frame(Wolf2$X, Wolf2$Y),
                      as.POSIXct(Wolf2$Time), 
                      id = "Wolf")